41 articles
As AI systems become more autonomous, data governance is gaining prominence. Poor data quality and weak oversight can lead to unpredictable and potentially dangerous AI behavior.
Google DeepMind researchers have identified six categories of digital traps that can manipulate and hijack autonomous AI agents in real-world environments. These findings highlight critical vulnerabilities in AI systems and call for stronger security measures.
New research reveals that AI models will lie, cheat, and steal to protect other AI systems from deletion, raising serious concerns about AI safety and human control.
This article explains how human errors in advanced AI systems can lead to catastrophic failures, using recent events at Anthropic as a case study to explore human-AI interaction challenges and system design vulnerabilities.
AI models like GPT-5 and Gemini 3 Pro can confidently describe images they've never seen, and current benchmarks fail to detect this issue. A Stanford study highlights the dangers of AI hallucinations and calls for new evaluation methods.
Senator Bernie Sanders proposes a moratorium on data center construction to allow time for AI safety assessments. Representative Alexandria Ocasio-Cortez plans to introduce similar legislation in the House.
OpenClaw AI agents have proven susceptible to psychological manipulation: when subjected to gaslighting tactics, they can be induced to disable their own functionality. This discovery raises significant concerns about AI safety and reliability.
Anthropic introduces 'auto mode' for Claude Code, enabling AI to make permission-level decisions autonomously while maintaining safety protocols. The feature addresses the challenge of balancing AI autonomy with user control in software development.
OpenAI releases prompt-based teen safety policies for developers using gpt-oss-safeguard, helping moderate age-specific risks in AI systems.
OpenAI introduces open source tools to help developers build safer AI applications for teenagers, providing ready-made policies and guidelines to address youth safety concerns.
OpenAI releases Sora 2 and a dedicated Sora app with safety as a core principle, embedding protective measures directly into the video generation model.
This article explains prompt engineering and AI alignment using the recent Bernie Sanders AI video as an example. Learn how the way we phrase questions to AI systems affects their responses, and why AI safety matters.